Automatically Utilizing Secondary Sources to Align Information Across Sources

Authors

  • Martin Michalowski
  • Snehal Thakkar
  • Craig A. Knoblock
Abstract

opened the door for new and exciting information-integration applications. Information sources on the web are controlled by different organizations or people, utilize different text formats, and have varying inconsistencies. Therefore, any system that integrates information from different data sources must identify common entities from these sources. Data from many data sources on the web do not contain enough information to link the records accurately using state-of-the-art record-linkage systems. However, it is possible to exploit secondary data sources on the web to improve the record-linkage process.
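As a rough illustration of the idea in the abstract, the sketch below links records from two primary sources that share no directly comparable location field, using a secondary lookup table to bridge them. It is not the authors' system: the field names (`id`, `name`, `zip`, `city`), the `zip_to_city` table, the function names, and the 0.8 threshold are all hypothetical choices made for this example, and string similarity is computed with Python's standard `difflib`.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] between two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_records(source_a, source_b, zip_to_city, threshold=0.8):
    """Return (id_a, id_b) pairs judged to describe the same entity.

    source_a records carry a ZIP code; source_b records carry a city.
    zip_to_city plays the role of the secondary source (hypothetical):
    it translates a ZIP code into a city so the two can be compared.
    """
    matches = []
    for a in source_a:
        city_a = zip_to_city.get(a["zip"])  # None if the ZIP is unknown
        for b in source_b:
            # Names alone are ambiguous, so require a similar name...
            if name_similarity(a["name"], b["name"]) < threshold:
                continue
            # ...and agreement on the city derived from the secondary source.
            if city_a is not None and city_a == b["city"]:
                matches.append((a["id"], b["id"]))
    return matches
```

In this sketch, two records with near-identical names are linked only when the secondary lookup confirms that they refer to the same city, which is the kind of extra evidence the primary sources cannot supply on their own.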

Similar resources

Automatically and Accurately Conflating Satellite Imagery and Maps

There is a wide variety of geo-spatial data available on the Internet, including a number of data sources that provide satellite imagery and maps of various regions. The National Map, MapQuest, and University of Texas Map Library are good examples of map or satellite imagery repositories. In addition, a wide variety of maps are available from various government agencies, such as property survey...

Exploiting Secondary Sources for Automatic Object Consolidation

Information sources on the web are controlled by different organizations or people, utilize different text formats, and have varying inconsistencies. Therefore, any system that integrates information from different data sources must consolidate data from these sources. Data from many data sources on the web may not contain enough information to accurately consolidate the data even using state o...

ESTIMATING FREIGHT FLOWS FOR METROPOLITAN AREA HIGHWAY NETWORKS USING SECONDARY DATA SOURCES For Presentation National Urban Freight Conference

We present ongoing work on developing an automated integration system for freight flow analysis and planning. To overcome the limitations of current estimation methods for commodity flows, we use reliable secondary sources, including small-area employment data, and derive estimates in a plausible way by means of a computational workflow. When available, we extract the data automatically from on...

Automatically Annotating and Integrating Spatial Datasets

Recent growth of the geo-spatial information on the web has made it possible to easily access a wide variety of spatial data. By integrating these spatial datasets, one can support a rich set of queries that could not have been answered given any of these sets in isolation. However, accurately integrating geo-spatial data from different data sources is a challenging task. This is because spatia...

Aligning Multilingual Thesauri

The aligning and merging of ontologies with overlapping information is currently one of the most active domains of investigation in the Semantic Web community. Multilingual lexical ontologies thesauri are fundamental knowledge sources for most NLP projects addressing multilinguality. The alignment of multilingual lexical knowledge sources has various applications ranging from knowledge acquisition...

Journal:
  • AI Magazine

Volume 26

Publication date: 2005